Akka

Akka is a toolkit for distribution, state, and protocols

Actors are a simple, high-level abstraction for asynchronous, message-driven programming
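A minimal sketch of the actor idea, written in plain Python with the standard library rather than the Akka API. The class name `CounterActor` and its `tell`/`stop` methods are hypothetical, chosen only for illustration. The point is that state lives inside one actor and is changed only by messages taken from its mailbox one at a time.

```python
import queue
import threading


class CounterActor:
    """A toy actor: private state plus a mailbox drained by one thread."""

    _STOP = object()  # sentinel message that ends the processing loop

    def __init__(self):
        self._mailbox = queue.Queue()
        self._counts = {}  # state touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def stop(self):
        """Let the actor finish queued messages, wait for it, return its state."""
        self._mailbox.put(self._STOP)
        self._thread.join()
        return dict(self._counts)

    def _run(self):
        # Messages are handled one at a time, in arrival order, so the
        # state needs no locks.
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._counts[message] = self._counts.get(message, 0) + 1
```

Senders never touch `_counts` directly; they only enqueue messages, which is what keeps the model simple to reason about.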

Akka Cluster has good testing support, both for mocking requests and for testing faults

Wire protocol

Abstractly, a way of getting data from point to point, often used by distributed object protocols

Spark

General-purpose cluster computing system. Provides an API centered on a data structure called the resilient distributed dataset (RDD)

Why Spark & Akka?

Spark is supposed to be stateless: it sends data out to be processed. Akka Cluster gives us a separate, asynchronous control channel

Data Ingestion

Handled by hashing incoming data to nodes
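One way to read "hash to nodes" is plain modulo hashing, sketched below in Python. This is an assumption about the scheme, not something the notes confirm, and the function `node_for` and the node and key names are made up for the example.

```python
import zlib


def node_for(key, nodes):
    """Pick the node that owns `key` using a stable hash modulo node count."""
    # zlib.crc32 is deterministic across processes, unlike Python's built-in
    # hash() for strings, which is randomized per interpreter run.
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
keys = ["sensor-1", "sensor-2", "sensor-3"]  # hypothetical record keys
assignments = {key: node_for(key, nodes) for key in keys}
```

The same key always lands on the same node, so whatever holds the state for that key can live in one place. Modulo hashing moves most keys when the node count changes; consistent hashing is the usual alternative when that matters.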

One actor holds state

Backpressure?

Monix/Reactive Streams?

Kamon tracing - built-in instrumentation for Akka actors

Scalactic

Scala optimization (Machine speed)

Don't serialize, don't allocate, don't copy

Binary data: cannot rely on static types and standard serialization mechanisms (Protobuf)
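A rough illustration of working on binary data in place, in Python rather than Scala. The record layout (a big-endian 4-byte id followed by an 8-byte float) and the name `value_at` are assumptions made up for the example. It shows reading one field at a computed offset instead of deserializing whole records into objects.

```python
import struct

# Hypothetical fixed-width record layout, big-endian: uint32 id, float64 value.
ID = struct.Struct(">I")
VALUE = struct.Struct(">d")
RECORD_SIZE = ID.size + VALUE.size  # 4 + 8 = 12 bytes per record


def value_at(buffer, index):
    """Read only the `value` field of record number `index`."""
    # unpack_from reads at an offset inside the existing buffer, so no slice
    # (and therefore no copy of the record bytes) is created.
    (value,) = VALUE.unpack_from(buffer, index * RECORD_SIZE + ID.size)
    return value


# Build a small buffer of three records to read from.
data = bytearray()
for record_id, value in [(1, 0.5), (2, 1.5), (3, 2.5)]:
    data += ID.pack(record_id) + VALUE.pack(value)

view = memoryview(data)  # a view over the same bytes, not a copy
second_value = value_at(view, 1)
```

The layout lives in the reading code, not in a static type, which is the trade-off the note points at: fast access, but nothing checks that the bytes match the layout.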